
Award Number: W81XWH-10-1-0741

TITLE: A Brain-Machine-Brain Interface for Rewiring of Cortical Circuitry after Traumatic Brain Injury

PRINCIPAL INVESTIGATOR: Pedram Mohseni, Ph.D.

CONTRACTING ORGANIZATION: Case Western Reserve University, Cleveland, OH 44106

REPORT DATE: September 2013

TYPE OF REPORT: Annual

PREPARED FOR: U.S. Army Medical Research and Materiel Command Fort Detrick, Maryland 21702-5012

DISTRIBUTION STATEMENT:

Approved for public release; distribution unlimited

The views, opinions and/or findings contained in this report are those of the author(s) and should not be construed as an official Department of the Army position, policy or decision unless so designated by other documentation.

## **REPORT DOCUMENTATION PAGE**

Form Approved OMB No. 0704-0188

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.

| Field | Entry |
|-------|-------|
| 1. REPORT DATE (DD-MM-YYYY) | September 2013 |
| 2. REPORT TYPE | Annual |
| 3. DATES COVERED (From - To) | September 2012 - 31 August 2013 |
| 4. TITLE AND SUBTITLE | A Brain-Machine-Brain Interface for Rewiring of Cortical Circuitry after Traumatic Brain Injury |
| 5a. CONTRACT NUMBER | |
| 5b. GRANT NUMBER | W81XWH-10-1-0741 |
| 5c. PROGRAM ELEMENT NUMBER | |
| 5d. PROJECT NUMBER | |
| 5e. TASK NUMBER | |
| 5f. WORK UNIT NUMBER | |
| 6. AUTHOR(S) | Pedram Mohseni (E-mail: pedram.mohseni@case.edu) |
| 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) | Case Western Reserve University, Cleveland, OH 44106 |
| 8. PERFORMING ORGANIZATION REPORT NUMBER | |
| 9. SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES) | U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland 21702-5012 |
| 10. SPONSOR/MONITOR'S ACRONYM(S) | |
| 11. SPONSOR/MONITOR'S REPORT NUMBER(S) | |
| 12. DISTRIBUTION / AVAILABILITY STATEMENT | Approved for Public Release; distribution unlimited |
| 13. SUPPLEMENTARY NOTES | |
| 14. ABSTRACT | […] and successfully tested for functionality in both anesthetized and ambulatory rats. Further, in semi-chronic experiments in rats with traumatic brain injury (TBI) using this […], as compared to control rats (injured […] some recovery after injury, but ADS is significantly more efficacious, resulting in recovery to normal ranges of performance within 2 weeks after injury. |
| 15. SUBJECT TERMS | Anatomical rewiring; Implantable microsystem; Neuroplasticity; Rehabilitation; Traumatic brain injury |
| 16. SECURITY CLASSIFICATION OF | a. REPORT: U; b. ABSTRACT: U; c. THIS PAGE: U |
| 17. LIMITATION OF ABSTRACT | UU |
| 18. NUMBER OF PAGES | 17 |
| 19a. NAME OF RESPONSIBLE PERSON | USAMRMC |
| 19b. TELEPHONE NUMBER (include area code) | |

## **Table of Contents**

- Introduction
- Body
- Key Research Accomplishments
- Reportable Outcomes
- Conclusion
- References (N/A)
- Appendices

## A Brain-Machine-Brain Interface for Rewiring of Cortical Circuitry after Traumatic Brain Injury

Principal Investigator: Pedram Mohseni, Ph.D.

Department of Electrical Engineering and Computer Science, Case Western Reserve University

Co-Principal Investigator: Randolph J. Nudo, Ph.D.

Department of Molecular and Integrative Physiology, Kansas University Medical Center

## Introduction

The goal of this project is to use an implantable brain-machine-brain interface to enhance behavioral recovery after traumatic brain injury (TBI) by reshaping long-range intracortical connectivity patterns. We hypothesize that artificial synchronous activation of distant cortical locations will encourage spontaneously sprouting axons to migrate toward and terminate in the coupled region, and that such directed sprouting can aid in functional recovery.

## **Body**

In this section of the annual report, we describe the research accomplishments associated with each task outlined in the approved Statement of Work.

### 1. Electronics Development

For **Tasks 1.1** and **1.2**, we decided to build the microsystem for non-human primate studies around the same integrated circuit (IC) previously developed for rodent studies, because the capabilities of the rat IC (e.g., spike-stimulus time delay range and stimulus current parameters) are suitable for the initial round of experiments with non-human primates. Further, we already have more than 10 functional ICs from the original round of IC fabrication, which obviates the need to re-fabricate and re-characterize the IC for non-human primate studies.

### 2. Microsystem Packaging

For **Tasks 2.1** and **2.2**, NeuroNexus Technologies (Ann Arbor, MI) was identified as a reliable commercial supplier of silicon-microfabricated microprobes for recording and stimulation. Further, Flexible Circuit Technologies (Plymouth, MN) was identified as a reliable commercial supplier of miniature, rigid-flex substrates. We have also previously worked with ProtoConnect (Ann Arbor, MI) for die attachment, encapsulation, wire bonding, and assembly of all the components onto the substrate.

Efforts are now focused on modifying the microsystem assembly and packaging for ambulatory experiments with non-human primates. Specifically, the goal is to fit the revised microsystem inside a custom-designed plastic chamber with internal dimensions of 18 mm × 18 mm that will be affixed to the skull of a squirrel monkey. We have also decided to move the battery and the wireless transceiver module to a backpack device (mounted on the back of the monkey) in order to further simplify the design of the microsystem inside the skull-mounted chamber. The microsystem is envisioned to connect to two multi-site, chronically implanted recording and stimulating microelectrodes (NeuroNexus Technologies) via two microconnectors (Omnetics Corp., Minneapolis, MN) in plug-and-play fashion. Acrylic will be used as a biocompatible encapsulant whenever necessary. As stated in the annual report of the Partnering PI, Prof. Randy Nudo, we have already completed the design and fabrication of the plastic chambers customized to fit the shape of the monkey skull. This was a collaborative effort between the engineering group at CWRU and the neurobiological team at KUMC.

In this section of the annual report, we describe the research accomplishments associated with tasks from previous phases as outlined in the approved Statement of Work.

Phase I (1-12 months), Task 1 (Electronics Development)

1.3 Design a neural signal processor for real-time stimulus artifact rejection using template subtraction technique with power consumption $\leq 5\,\mu\mathrm{W}$.

An infinite impulse response (IIR) temporal filtering technique for real-time stimulus artifact rejection (SAR) based on template subtraction was developed. A system architecture for the IIR SAR algorithm was also developed, and the operation of the algorithm with fixed-point computation was analyzed to obtain the number of bits for the internal nodes of the system, considering dynamic range and fraction length requirements for optimum performance. Further, memory initialization with the first recorded stimulus artifact was implemented to significantly decrease the IIR system response time, especially when artifacts were highly reproducible in consecutive stimulation cycles. The proposed system architecture was hardware-implemented on a field-programmable gate array (FPGA) and tested using two sets of prerecorded neural data from a rat and an *Aplysia californica* (a marine sea slug) obtained from two different laboratories. The measured results from the FPGA verified that the system can indeed remove the stimulus artifacts from the contaminated neural data in real time and recover the neural action potentials that occur on the tail end of the artifact (as close as within 0.5 ms after the artifact spike). The root-mean-square (rms) value of the pre-processed stimulus artifact was reduced on average by a factor of 17 (*Aplysia californica*) and 5.3 (rat) post-processing. Details of the IIR SAR algorithm, its FPGA implementation and testing with prerecorded neural datasets are reported in a manuscript currently in press with the *IEEE Transactions on Biomedical Circuits and Systems* (see Appendix I).
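The role of memory initialization described above can be illustrated with a short sketch. This is not the reported FPGA design: it assumes a first-order recursive averager as the IIR template generator, and the function name `iir_sar`, the update weight `alpha`, and the synthetic data in the usage note are all hypothetical.

```python
import numpy as np

def iir_sar(cycles, alpha=0.25, init_with_first=True):
    """Template-subtraction SAR with a first-order IIR template (illustrative only).

    cycles: 2-D array, one row per stimulation cycle, samples aligned to stimulus onset.
    alpha:  hypothetical update weight of the recursive averager (0 < alpha <= 1).
    init_with_first: preload the template memory with the first recorded artifact.
    """
    cycles = np.asarray(cycles, dtype=float)
    # Template memory: either the first recorded cycle or all zeros.
    if init_with_first:
        template = cycles[0].copy()
    else:
        template = np.zeros(cycles.shape[1])
    cleaned = np.empty_like(cycles)
    for k, x in enumerate(cycles):
        # Subtract the template built from earlier cycles, then update it.
        cleaned[k] = x - template
        template = (1.0 - alpha) * template + alpha * x
    return cleaned
```

With identical artifacts in every cycle, the preloaded template cancels the artifact from the second cycle onward, whereas a zero-initialized template leaves a residual that shrinks only gradually. Note that in this sketch the first cycle is cancelled entirely when it is used to preload the memory.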

## **Key Research Accomplishments**

- Developed a neural signal-processing algorithm for real-time stimulus artifact rejection.
- Implemented the algorithm in hardware on an FPGA for real-time operation.
- Prepared and submitted a manuscript to *IEEE Trans. Biomedical Circuits and Systems*; the paper has been accepted and is currently in press.

## **Reportable Outcomes**

**1- Manuscripts/Abstracts/Presentations:**

- D. J. Guggenmos, M. Azin, S. Barbay, J. D. Mahnken, C. Dunham, P. Mohseni, and R. J. Nudo, "Restoration of function after brain damage using a neural prosthesis," *Proc. Natl. Acad. Sci. USA (PNAS)*, in press.
- K. Limnuson, H. Lu, H. J. Chiel, and P. Mohseni, "Real-time stimulus artifact rejection via template subtraction," *IEEE Trans. Biomed. Circuits and Systems*, in press.
- D. J. Guggenmos, C. Dunham, M. Azin, S. Barbay, J. D. Mahnken, P. Mohseni, and R. J. Nudo, "Neurophysiological effects of activity-dependent stimulation following a controlled cortical impact to primary motor cortex of the rat," *Program No. 79.12*, 2013 Neuroscience Meeting Planner, San Diego, CA, Society for Neuroscience, November 2013. Online.
- D. J. Guggenmos, M. Azin, S. Barbay, P. Mohseni, and R. J. Nudo, "Activity-dependent stimulation drives functional recovery after traumatic brain injury in the rat," *Program No. 682.16*, 2012 Neuroscience Meeting Planner, New Orleans, LA, Society for Neuroscience, October 2012. Online.

**2- Patents and Licenses Applied for/Issued:** None issued yet.

**3- Degrees Obtained from Award:** None yet.

**4- Development of Cell Lines and Tissue/Serum Repositories:** Not applicable.

**5- Informatics (Databases and Animal Models):** None yet.

**6- Funding Applied for:** None yet.

**7- Employment/Research Opportunities Applied for/Received:** None yet.

## **Conclusion**

Rapid progress is being made toward developing smart prosthetic platforms for altering plasticity in the injured brain, leading to future therapeutic interventions for TBI that are guided by the underlying mechanisms for long-range functional and structural plasticity in the cerebral cortex. An unprecedented, potent effect of activity-dependent stimulation (ADS) on motor performance has been demonstrated in rats with TBI. Statistical analysis of the data is complete and includes both un-implanted and open-loop stimulation control groups. Post-hoc physiological data demonstrate rapid establishment of functional connectivity between the two areas. Efforts are currently focused on developing a revised microsystem that would enable the investigation of the safety and efficacy of this approach in a non-human primate model of TBI. In parallel, we have also established the feasibility of hardware implementation of a neural signal-processing algorithm for real-time elimination of stimulus artifacts that can potentially increase the amount of conditioning performed by the microsystem between the two cortical regions.

**Appendix I:** K. Limnuson, H. Lu, H. J. Chiel, and P. Mohseni, "Real-time stimulus artifact rejection via template subtraction," *IEEE Trans. Biomed. Circuits and Systems*, in press.

# Real-Time Stimulus Artifact Rejection Via Template Subtraction

Kanokwan Limnuson, Student Member, IEEE, Hui Lu, Hillel J. Chiel, and Pedram Mohseni, Senior Member, IEEE

Abstract—This paper presents an infinite impulse response (IIR) temporal filtering technique for real-time stimulus artifact rejection (SAR) based on template subtraction. A system architecture for the IIR SAR algorithm is developed, and the operation of the algorithm with fixed-point computation is analyzed to obtain the number of bits for the internal nodes of the system, considering dynamic range and fraction length requirements for optimum performance. Further, memory initialization with the first recorded stimulus artifact is proposed and shown to significantly decrease the IIR system response time, especially when artifacts are highly reproducible in consecutive stimulation cycles. The proposed system architecture is hardware-implemented on a field-programmable gate array (FPGA) and tested using two sets of prerecorded neural data from a rat and an Aplysia californica (a marine sea slug) obtained from two different laboratories. The measured results from the FPGA verify that the system can indeed remove the stimulus artifacts from the contaminated neural data in real time and recover the neural action potentials that occur on the tail end of the artifact (as close as within 0.5 ms after the artifact spike). The root-mean-square (rms) value of the pre-processed stimulus artifact is reduced on average by a factor of 17 (Aplysia californica) and 5.3 (rat) post-processing.

Index Terms—Closed-loop neuroprostheses, field-programmable gate array (FPGA), neural recording, neurostimulation, stimulus artifact rejection, template subtraction.

Manuscript received February 24, 2013; revised May 31, 2013; accepted July 14, 2013. This work was supported by the Department of Defense Traumatic Brain Injury - Investigator-Initiated Research Award Program under Award W81XWH-10-1-0741 (to P. Mohseni) and National Institutes of Health Grant NS047073 (to H. J. Chiel). This paper was recommended by Associate Editor E. M. Drakakis.

- K. Limnuson is with the Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH 44106 USA.
- H. Lu and H. J. Chiel are with the Department of Biology, Case Western Reserve University, Cleveland, OH 44106 USA (e-mail: hjc@case.edu).
- P. Mohseni is with the Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH 44106 USA, and also with the Advanced Platform Technology (APT) Center, a Veterans Affairs (VA) Research Center of Excellence, Cleveland, OH 44106-1702 USA (e-mail: pedram.mohseni@case.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TBCAS.2013.2274574

### I. INTRODUCTION

Stimulus artifact rejection (SAR) is important in biopotential recording whenever stimulation is performed in the same medium in which the recording electrodes are also placed [1]. This is because the large stimulus artifacts can corrupt or mask the neural activity of interest, either hindering the analysis of stimulus-evoked recorded data [1], or limiting the efficacy of activity-dependent stimulation for closed-loop operation [2], [3]. Many SAR techniques have been developed in the past that use the same fundamental principles for rejection, and the choice of a particular method is typically dependent on the type of biopotential that is being recorded and the conditions under which the recording is taking place [4]–[7].

The two primary classes of SAR techniques are the so-called blanking and subtraction techniques. There are also some other techniques that do not readily fit into one of these two categories [8], [9]. Blanking techniques essentially disconnect the input of the recording amplifier during stimulation. Stimulation-synchronized blanking can be achieved by several methods, including grounding the amplifier input [10], [11], connecting the amplifier input to its output or to that of a sample-and-hold circuit [12], [13], digitally replacing the contaminated signal during the artifact interval with an estimate of the uncontaminated signal [14], and using high-speed auto-zeroing to maintain the amplifier output constant during stimulation [15]. In general, blanking techniques are relatively simple, effective for rejecting large stimulus artifacts, practical for preventing amplifier saturation, and inherently amenable to hardware implementation for real-time SAR. The major drawback is that recording is not viable during stimulation.
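As a rough software analogue of sample-and-hold blanking, the sketch below holds the last pre-stimulus sample for a fixed window after each stimulus. It is only an illustration of the principle, not any of the cited circuits; the function name `blank_artifacts`, the window length `blank_len`, and the example samples are assumptions.

```python
import numpy as np

def blank_artifacts(signal, stim_onsets, blank_len):
    """Replace blank_len samples after each stimulus onset with the held pre-stimulus value.

    signal:      1-D recorded trace.
    stim_onsets: sample indices at which stimulation starts (hypothetical input).
    blank_len:   number of samples to blank per stimulus (hypothetical parameter).
    """
    out = np.array(signal, dtype=float)
    for onset in stim_onsets:
        # Value held during the blanking window: the last sample before the stimulus.
        hold = out[onset - 1] if onset > 0 else 0.0
        out[onset:onset + blank_len] = hold
    return out
```

Everything inside the blanking window is overwritten, including any neural activity that occurred there, which is the drawback noted above.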

Subtraction techniques basically subtract a template signal representative of the stimulus artifacts from the contaminated neural data to remove the artifacts. These techniques do not prevent amplifier saturation on their own and often necessitate running a digital signal processing (DSP) algorithm, rendering them much more complex than the blanking techniques. The major advantage is that these techniques make it possible to retain signal information during stimulation.

Generating an accurate template signal has been the main focus of research in subtraction-based SAR techniques and can be achieved by several methods, including artifact modeling based on locally fitted cubic polynomials [5], capturing the artifact from subthreshold stimulation or from a second recording site remote from the stimulation site [1], and temporal averaging of the contaminated data for multiple consecutive stimulation cycles [16], [17], with the underlying assumption that the overall shape, dynamic range, and timing (e.g., latency with respect to the stimulus timing signal) of the stimulus artifacts do not significantly vary with time.
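The temporal-averaging approach can be sketched as a moving average over the most recent stimulation cycles. This is a minimal illustration under the stated assumption that artifacts are stable from cycle to cycle; the function name `averaged_template_sar`, the window size `n_avg`, and the choice to include the current cycle in the average are assumptions, not details from the cited work.

```python
from collections import deque

import numpy as np

def averaged_template_sar(cycles, n_avg=4):
    """Subtract a template formed by averaging the last n_avg stimulation cycles.

    cycles: 2-D array, one row per stimulation cycle, aligned to stimulus onset.
    n_avg:  hypothetical number of consecutive cycles in the average.
    """
    cycles = np.asarray(cycles, dtype=float)
    history = deque(maxlen=n_avg)  # keeps only the most recent n_avg cycles
    cleaned = np.empty_like(cycles)
    for k, x in enumerate(cycles):
        history.append(x)
        # The template includes the current cycle, so a spike that appears in
        # only one cycle is attenuated by 1/n_avg rather than removed.
        template = np.mean(list(history), axis=0)
        cleaned[k] = x - template
    return cleaned
```

Because neural activity that is not time-locked to the stimulus averages toward zero in the template, it largely survives the subtraction, while the repeatable artifact is cancelled.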

Subtraction techniques have the potential to fully eliminate the artifacts from the contaminated data record, but have to rely on the generation of an accurate template signal for subtraction, which in turn necessitates an adjustment in the recording amplifier gain or stimulus intensity to enable non-saturated recording of the full-scale stimulus artifact. On the other hand, providing a low-impedance discharge path for the stimulation electrode using active feedback circuitry [18], [19], as well as careful design of the stimulator in terms of isolation of stimulation channels and parasitic current injection [20] have been previously shown to decrease the duration and amplitude of otherwise-saturating stimulus artifacts. But these approaches cannot fully eliminate the artifacts on their own, suggesting that an optimal solution might be to combine them with the subtraction techniques.

Since subtraction techniques typically require a DSP algorithm for the generation of the template signal, they have traditionally been implemented offline on a home-base computer post-data acquisition. To execute a subtraction-based SAR algorithm in real time (i.e., as the recording is taking place), a suitable template-generation technique should be selected and optimized, realized in hardware, and tested with real neural data, paving the way for ultimately implementing it on a custom integrated circuit (IC).

We have previously assessed the feasibility of hardware implementation of a subtraction-based SAR algorithm using the well-established finite impulse response (FIR) and infinite impulse response (IIR) temporal filtering techniques for template generation [21]. Using MATLAB<sup>TM</sup> simulations, both implementations were shown to be capable of removing stimulus artifacts upon reaching steady-state, with the IIR architecture offering a more favorable tradeoff among performance, computational resources, and power consumption at the expense of its operation speed.

This paper presents our work on hardware implementation of the IIR system proposed in [21] for a real-time SAR algorithm based on template subtraction. The paper is organized as follows. Section II describes the SAR algorithm and the corresponding IIR system architecture, and Section III analyzes its dynamic range and fraction length requirements to determine the number of bits for the internal nodes of the system in fixed-point computation. Section IV describes the implementation of the IIR SAR algorithm on a field-programmable gate array (FPGA), and Section V presents the measured FPGA results using two prerecorded neural datasets. Finally, Section VI draws some conclusions from this work.

### II. SAR ALGORITHM

To generate a template signal representative of the stimulus artifact, temporal filtering is employed in which several properly shifted versions of the input neural data containing the stimulus artifacts are averaged. This is represented by [21]

$$y(t) = \sum_{n=0}^{N-1} a(n) \cdot x(t - nT_{sti}) \tag{1}$$

where y(t) is the estimated template signal, x(t) is the input neural data containing the stimulus artifacts, N is the number of stimulus artifact waveforms used for template estimation, a(n) are averaging factors that should sum up to unity for the stimulus artifact and y(t) to have the same amplitude (e.g., a(n) factors can be all equal to 1/N for standard averaging), and $T_{sti}$ is the stimulation period. It should be noted that the stimulation occurrence does not necessarily have to be periodic for correct operation of the SAR algorithm, as long as it is predictable via a stimulus timing signal.

Fig. 1. System architecture for the IIR implementation of the template subtraction-based SAR algorithm. The number of bits in internal operation of the algorithm is also shown.

An FIR implementation of (1) was previously shown to require at least N-1 memory rows and N summations in each period of the sampling clock, whereas the IIR implementation would require a single memory row and only three summations at the expense of much longer system response time [21]. Initializing the memory with the first recorded artifact can significantly decrease the IIR system response time for creating an accurate artifact template signal [22]. Therefore, this paper focuses on the IIR implementation of the SAR algorithm with memory initialization.
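The memory trade-off between the two template generators can be illustrated with a minimal Python sketch. It is not the MATLAB model of [21]; the function names, the four-sample artifact waveform, and the choice N = 16 are assumptions made only for illustration. The FIR version keeps a history of artifacts and averages them with a(n) = 1/N, whereas the IIR version keeps a single memory row that is initialized with the first artifact.

```python
from collections import deque

def fir_template(history):
    """Equal-weight average of the stored artifacts, sample by sample (a(n) = 1/N)."""
    n = len(history)
    return [sum(column) / n for column in zip(*history)]

def iir_template(previous, artifact, K):
    """Single-row update of (2): y_n = (1 - K) * y_{n-1} + K * x_n."""
    return [(1 - K) * y + K * x for y, x in zip(previous, artifact)]

N = 16                                   # assumed number of averaged artifacts
K = 1 / 16                               # IIR factor, one of the two values used in the paper
artifact = [0.0, 400.0, -300.0, 100.0]   # made-up, perfectly reproducible artifact

# FIR: the sketch stores N rows for simplicity; hardware needs only N-1 rows
# because the current artifact arrives live.
history = deque(maxlen=N)
# IIR: one memory row, initialized with the first recorded artifact.
memory = list(artifact)

for _ in range(N):
    history.append(artifact)
    memory = iir_template(memory, artifact, K)

fir_out = fir_template(history)
fir_rows = len(history)
iir_rows = 1
```

With a reproducible artifact, both templates equal the artifact itself, but the FIR history grows with N while the IIR memory stays at one row.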

Fig. 1 depicts the system architecture, comprising neural-recording front-end circuitry for signal conditioning and a DSP unit for executing the SAR algorithm. The recording front-end provides ac amplification, dc input stabilization, bandpass filtering, and 10b digitization of the recorded neural signal with fully programmable gain and bandwidth, similar to what has previously been shown in [3]. The DSP unit, which is the focus of this paper, provides additional highpass filtering using an IIR digital filter with adjustable bandwidth to remove any residual dc offsets or low-frequency noise, and performs real-time stimulus artifact rejection using template subtraction. Based on Fig. 1:

$$y_n = (1 - K) \cdot y_{n-1} + K \cdot x_n \tag{2}$$

where  $y_n$  is the new artifact template signal,  $y_{n-1}$  is the previous template signal, and  $x_n$  is the input neural data. Therefore, in the IIR implementation, the stimulus artifact template signal is retained in the memory, and a new template signal is generated from the previous template signal and the input neural data according to (2), which is then subtracted from the input neural data. The factor K (<1) plays a similar role to N in (1), affecting the IIR system response time and accuracy. As shown in the Appendix, it can be derived from (2) that the minimum number of stimulus artifacts, m, required to generate an accurate template signal with error less than, e.g., 0.1% is

$$m > \frac{-3 - \log_{10}|1 - Y_0|}{\log_{10}(1 - K)} \tag{3}$$

where $Y_0$ is the initial condition of the memory normalized to the steady-state artifact template signal. Fig. 2 shows a plot of m versus $Y_0$ for four different values of K. Clearly, the closer the initial condition is to the steady-state template signal, the faster the system response time, showing that the IIR implementation is particularly effective when stimulus artifacts in consecutive stimulation cycles are reproducible. In this work, the factor K is selected to be either 1/16 or 1/32, which also allows implementing the multiplication-by-K function via a shift to the right by 4b or 5b, respectively, obviating the need for digital multipliers.

Fig. 2. Minimum number of stimulus artifacts required to generate an accurate template signal with error <0.1% as a function of the normalized initial condition of the memory $(Y_0 < 1)$ for four different values of K.
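As a numerical check of (3), the following Python sketch evaluates the bound and compares it against a direct iteration of (2) with a constant, normalized artifact. The function names and the default error threshold argument are assumptions introduced for this illustration only.

```python
import math

def min_artifacts(K, Y0, err=1e-3):
    """Smallest integer m satisfying (3) for a relative template error below err."""
    if Y0 == 1.0:
        return 0                          # memory already holds the steady-state template
    bound = (math.log10(err) - math.log10(abs(1.0 - Y0))) / math.log10(1.0 - K)
    return max(0, math.floor(bound) + 1)  # strict inequality: m > bound

def template_error(K, Y0, m):
    """Relative error after m updates of (2) with a constant unit artifact."""
    y = Y0
    for _ in range(m):
        y = (1.0 - K) * y + K * 1.0
    return abs(1.0 - y)

m_cleared_16 = min_artifacts(1 / 16, 0.0)   # cleared memory, K = 1/16
m_cleared_32 = min_artifacts(1 / 32, 0.0)   # cleared memory, K = 1/32
m_close_16 = min_artifacts(1 / 16, 0.9)     # memory initialized within 10% of steady state
```

For a cleared memory the bound evaluates to roughly 107.03 for K = 1/16 and 217.6 for K = 1/32, so 108 and 218 artifacts are needed, respectively. Starting within 10% of the steady-state template shortens this considerably.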

It is worth noting that the artifact template generation technique in (2) performed by the proposed IIR system is in essence an exponentially weighted moving average (EWMA) [23], a statistical tool with a rich history in process monitoring and quality control charting [24], [25] as well as economics [26] and industrial quality control [27]. In this paper, we utilize a real-time implementation of the EWMA for a novel application in neural signal processing. Section III discusses the performance of the IIR SAR algorithm with fixed-point computation and provides a framework for determining the optimum number of bits in internal operation of the algorithm.

### III. SAR ALGORITHM WITH FIXED-POINT COMPUTATION

When template calculations are performed with floating-point precision, similar to when the SAR algorithm is executed offline in MATLAB<sup>TM</sup> on a home-base computer post-data acquisition, the output can be very accurate. However, for *real-time* execution of the algorithm in hardware, fixed-point computation is preferred for simplicity, which then raises concerns about the template signal accuracy due to quantization noise. In this section, we find the optimum number of bits in internal operation of the SAR algorithm by analyzing the dynamic range and fraction length requirements.

In IIR systems, the internal nodes of the structure can potentially overflow, necessitating an adjustment in their dynamic range to satisfy the L1-norm criterion for preventing an overflow [28]–[30]. In Fig. 1, consider the signal path from the input neural data (i.e., $x_n$) to each of the four internal nodes of the algorithm (i.e., nodes #1–4). Assume the resulting transfer functions and corresponding impulse responses are $F_i(z)$ and $f_i[n]$, respectively. Modeling the memory block as a unit delay, it can be shown that

$$\begin{aligned}
F_{1}(z) &= K \\
F_{2}(z) &= \frac{K}{1 - (1 - K)z^{-1}} \\
F_{3}(z) &= \frac{Kz^{-1}}{1 - (1 - K)z^{-1}} \\
F_{4}(z) &= \frac{K(1 - K)z^{-1}}{1 - (1 - K)z^{-1}}.
\end{aligned} \tag{4}$$

Fig. 3. L1-norm estimates at nodes #1-4 for the two selected values of K.

Fig. 3 depicts the L1-norm estimates of the four transfer functions for the two selected values of K, where L1-norm is

$$||f||_1 = \sum_{n=0}^{\infty} |f[n]|. \tag{5}$$

As can be seen, in all cases the L1-norm estimates are less than one, indicating that no additional bits (equal to $\log_2 ||f||_1$) are needed beyond 10b for the internal nodes to avoid overflow. The SAR algorithm output node $(z_n = x_n - y_n)$ has a higher dynamic range of 11b to prevent saturation of the output after subtraction, in case of an overflow/underflow.
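The L1-norm estimates can be reproduced from the impulse responses implied by (4). The Python sketch below is an illustration only: it truncates the infinite sum in (5) to an assumed 200 terms and uses exact rational arithmetic so that rounding cannot push a partial sum above its limit. The function names are hypothetical.

```python
from fractions import Fraction
import math

def l1_norm_estimates(K, n_terms=200):
    """Truncated L1 norms of f_1[n]..f_4[n] derived from the transfer functions in (4)."""
    f2 = [K * (1 - K) ** n for n in range(n_terms)]   # K / (1 - (1-K) z^-1)
    f1 = [K] + [0] * (n_terms - 1)                    # constant gain K
    f3 = [0] + f2[:-1]                                # f2 delayed by one sample
    f4 = [(1 - K) * v for v in f3]                    # f3 scaled by (1 - K)
    return [sum(abs(v) for v in f) for f in (f1, f2, f3, f4)]

def extra_bits(norm):
    """Additional integer bits needed at a node to avoid overflow."""
    return 0 if norm <= 1 else math.ceil(math.log2(norm))

norms_16 = l1_norm_estimates(Fraction(1, 16))
norms_32 = l1_norm_estimates(Fraction(1, 32))
bits_16 = [extra_bits(v) for v in norms_16]
bits_32 = [extra_bits(v) for v in norms_32]
```

The truncated sums approach K, 1, 1, and 1 - K for nodes #1 to #4 as the number of terms grows, and they never exceed one, so no extra integer bits are required at any node.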

Next, to assess the impact of quantization noise induced by fixed-point computation on template signal accuracy, we determine the signal-to-noise ratio (SNR) in template signal generation as a function of the fraction length for the internal nodes (i.e., number of additional bits beyond 10b in a word-length). Fig. 4 shows the simulation structure for comparing the performance of the SAR algorithm with fixed-point computation versus that with floating-point computation by determining the SNR [31]. $Q_1$ and $Q_2$ are two quantizers that quantize their inputs to the word-length value, whereas $Q_3$ quantizes its input to 10b. Fig. 5 depicts the simulated SNR and effective number of bits (ENOB) in template signal generation for the two selected values of K, where the SNR is defined as

$$\mathrm{SNR} = 20 \log_{10} \frac{S_{\text{out,rms}}}{N_{Q,\text{rms}}} \tag{6}$$

with $S_{\text{out}}$ and $N_Q$ representing the reference output and quantization noise, respectively. Input $x_n$ is taken to be a 10b-digitized sinusoidal signal with rail-to-rail amplitude (i.e., -512 to 511 in two's complement format) and a frequency of 0.1 mHz to capture the underlying assumption that the stimulus artifacts do not change rapidly with time. Assuming a stimulation frequency of 1 Hz, $x_n$'s sampling frequency is also 1 Hz. Clearly, the system requires a fraction length of 5b to achieve $\sim 10b$ accuracy in template signal generation with K=1/32. A lower fraction length would increase the quantization noise and degrade the accuracy to <10b, whereas a higher fraction length not only would increase the requisite hardware resources to support larger memory size, but also would not offer any significant benefit given that by design the overall system performance would be limited by that of the neural-recording front-end [3], and not the DSP unit. Taking into account these considerations related to dynamic range and fraction length requirements, the selected number of bits for the internal operation of the SAR algorithm is shown in Fig. 1.

Fig. 4. Simulation structure for determining the SNR in template signal generation.

Fig. 5. Simulated SNR and ENOB of template signal generation in the IIR SAR algorithm with fixed-point computation versus the fraction length for the two selected values of K.
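The dependence of template accuracy on fraction length can be explored with a simplified Python sketch. It is not the simulation structure of Fig. 4: the output quantizer $Q_3$ is omitted, truncation by arithmetic right shift is assumed for the internal quantizers, and the simulated record length of 20,000 samples is an arbitrary choice. The sketch compares a floating-point reference of (2) with an integer version whose state carries a given number of fractional bits, and evaluates (6) in power form.

```python
import math

def template_snr_db(frac_bits, k_shift=5, n_samples=20000, freq_hz=1e-4):
    """SNR (dB) of fixed-point vs floating-point template generation, K = 2**-k_shift."""
    K = 2.0 ** -k_shift
    y_ref = 0.0      # floating-point reference template
    y_fix = 0        # fixed-point template, an integer scaled by 2**frac_bits
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        # 10b rail-to-rail sinusoid sampled at 1 Hz (one sample per stimulus).
        x = round(511 * math.sin(2 * math.pi * freq_hz * n))
        y_ref = (1 - K) * y_ref + K * x
        # Assumed hardware-style update: multiplications by K are right shifts.
        x_scaled = x << frac_bits
        y_fix = y_fix - (y_fix >> k_shift) + (x_scaled >> k_shift)
        error = y_fix / (1 << frac_bits) - y_ref
        signal_power += y_ref * y_ref
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)

snr_by_fraction_length = {f: template_snr_db(f) for f in (0, 5)}
```

In this simplified model the quantization error shrinks as fractional bits are added, so the SNR for a 5b fraction length is well above that for no fractional bits. The absolute values should not be expected to match Fig. 5.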

### IV. FPGA IMPLEMENTATION

The DSP unit in Fig. 1, comprising the digital highpass filter (HPF) and the SAR algorithm circuitry, has been implemented on an FPGA using the DE2 Development and Educational Board, which has the Cyclone II device by *Altera* as its FPGA platform. Fig. 6 depicts the architecture of the DSP unit in FPGA implementation, which incorporates a 68b parameter register, a digital control unit, and a DSP core. The parameter register is used to store the user-selectable parameters for system operation such as the bandwidth setting of the digital HPF and factor K in the SAR algorithm, as well as memory initialization, memory length, and output-blanking settings. The memory length (i.e., number of 16b samples) is determined by the sampling clock frequency and the stimulus artifact duration. If needed, the blanking feature is used after template subtraction to remove any residual artifacts in the output around the rising and falling edges of the artifact where it rapidly changes with time [21]. The parameter register is implemented as a standalone circuit block with its own timing and control operation, which is separate from that of the other circuit blocks and applied externally. This is because this block is loaded with the requisite system parameters only once prior to the experiment and is not synchronously clocked with the rest of the circuit during SAR algorithm operation.

The digital control unit incorporates counters and finite-state machines and provides timing, path, and blanking control signals for the DSP core. The required inputs for the digital control unit include a stimulus timing signal, system clock and sampling clock signals, and system parameters such as memory length, memory initialization, and blanking settings.

The DSP core incorporates a digital HPF, circuitry to execute the SAR algorithm, and parallel-to-serial converters at the output. The required inputs for the DSP core include the amplified/digitized neural signal (10b), system clock signal, and control signals provided by the digital control unit. Fig. 6 also shows the structure of the digital HPF and SAR algorithm circuitry in the DSP core as implemented on the FPGA. The amplified/digitized input neural signal is first highpass filtered using a first-order IIR filter with direct form II architecture. Factor $K_1$ is the user-selected HPF coefficient that controls the filter bandwidth and is selected judiciously to perform the filtering using arithmetic shifts, subtraction and addition only, with no need for digital multipliers or dividers [3]. The user can set $K_1$ to be either 1/16 or 1/8, which results in a filter cutoff frequency of 366 Hz or 756 Hz, respectively, from a 1-MHz system clock. Since the digitized data at the analog-to-digital converter (ADC) output are unsigned numbers (10b), an offset of 512 is subtracted from the input signal to convert it to two's complement format for further processing. In addition, an overflow/underflow detector is used at the HPF output to limit its dynamic range to 10b before feeding it to the SAR algorithm circuitry.
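The three operations described above (offset removal, multiplier-free first-order highpass filtering, and output limiting to 10b) can be sketched in Python as follows. This is a hypothetical leaky-accumulator form chosen for brevity, not the direct form II structure of Fig. 6, and it makes no attempt to reproduce the reported cutoff frequencies. The function name and the scaled accumulator are assumptions.

```python
def highpass_shift(samples_unsigned, shift=4):
    """First-order highpass using only shifts, additions, and subtractions.

    shift=4 corresponds to a coefficient of 1/16 and shift=3 to 1/8.
    """
    acc = 0      # low-pass state scaled by 2**shift to avoid a dead zone
    out = []
    for raw in samples_unsigned:
        x = raw - 512                        # unsigned 10b -> two's complement
        acc += x - (acc >> shift)            # acc / 2**shift tracks the slow baseline
        y = x - (acc >> shift)               # highpass output = input minus baseline
        out.append(max(-512, min(511, y)))   # overflow/underflow limiting to 10b
    return out

dc_response = highpass_shift([700] * 1000)             # constant (dc) input
step_response = highpass_shift([0] * 1000 + [1023])    # rail-to-rail step at the end
```

A constant input decays to zero at the output, while a rail-to-rail step momentarily exceeds the 10b range and is limited to 511.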

The SAR algorithm only operates for the duration of each stimulus artifact. The digitized/filtered sample at the output of the HPF (10b) is first converted to 15b via a shift to the left by 5b and then multiplied by factor $K_2$ (same as K in Fig. 1) stored in the parameter register. Next, the memory data containing the previous template signal are read, multiplied by $(1-K_2)$, and added to $(K_2 \cdot x_n)$ to obtain the new template signal (15b), which is written back into the memory for the next cycle. The new template signal is also converted back to 10b and subsequently subtracted from the 10b digitized/filtered input sample to produce the SAR algorithm output signal. Outside the duration of the stimulus artifact, the SAR algorithm circuitry is disabled and the digitized/filtered sample at the HPF output is directly passed to the output register.

Fig. 6. Architecture of the DSP unit (top) and structure of the digital HPF and SAR algorithm circuitry in the DSP core (bottom) as implemented on the FPGA.

The path control signal from the digital control unit manages the memory initialization. Specifically, if the recorded stimulus artifact is the first artifact, indicated as such by the stimulus timing signal, the path control signal routes the 15b sample directly to the memory input for its initialization. With the next indication of stimulation by the stimulus timing signal, the IIR system executes the SAR algorithm as previously described. If the memory initialization setting is not enabled by the user, the memory can be cleared to start with zero internal values, but this would increase the IIR system response time as previously shown in Fig. 2.
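The per-sample data path and the memory initialization described above can be sketched in Python with integer arithmetic. The function names are hypothetical, and realizing the multiplication by $(1-K_2)$ as a subtraction of a right-shifted copy is an assumption about the circuit rather than a detail stated in the text.

```python
FRACTION_BITS = 5   # 10b sample -> 15b internal word

def init_memory(first_artifact):
    """Route the first recorded artifact directly to the memory as 15b samples."""
    return [x << FRACTION_BITS for x in first_artifact]

def process_artifact(memory, artifact, k_shift):
    """Update the template in place and return the artifact-subtracted output."""
    output = []
    for i, x in enumerate(artifact):
        x15 = x << FRACTION_BITS                      # 10b -> 15b
        kx = x15 >> k_shift                           # K2 * x_n via a right shift
        new = memory[i] - (memory[i] >> k_shift) + kx # (1 - K2) * y_{n-1} + K2 * x_n
        memory[i] = new                               # written back for the next cycle
        template10 = new >> FRACTION_BITS             # back to 10b
        output.append(x - template10)                 # fits in the 11b output node
    return output

artifact = [0, 400, -300, 100, 7]   # made-up 10b artifact samples

# With memory initialization, a reproducible second artifact is fully removed.
memory = init_memory(artifact)
out_initialized = process_artifact(memory, artifact, k_shift=4)

# With a cleared memory, the first template is only 1/16 of the artifact.
cleared = [0]
out_cleared = process_artifact(cleared, [400], k_shift=4)
```

With $K_2$ = 1/16, a cleared memory leaves 15/16 of a 400-count sample (375 counts) in the output after the first update, whereas an initialized memory removes a repeated artifact completely.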

The 16b, 4K memory is implemented using the internal SRAM of the FPGA. Even parity is used to check for memory errors: the parity bit is generated by an *XOR* function of all the bits in each 15b sample and is then appended to the end of the data bits before the result is written into the memory as a 16b sample. When the memory data are read out, a parity checker checks for memory error, and this information is sent to the output. The 15b sample is also sent to the rest of the SAR algorithm circuitry for template generation. Including the memory parity check feature, while not entirely necessary for an FPGA-based system, would streamline the design translation from an FPGA to an IC platform in the future.
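The even-parity scheme can be sketched in Python as follows. Placing the parity bit in the least significant position of the 16b word is an assumption about what "the end of the data bits" means, and the function names are hypothetical.

```python
DATA_BITS = 15
DATA_MASK = (1 << DATA_BITS) - 1

def add_parity(sample):
    """Pack a 15b two's complement sample and its even-parity bit into a 16b word."""
    data = sample & DATA_MASK
    parity = bin(data).count("1") & 1   # XOR of all 15 data bits
    return (data << 1) | parity         # parity bit assumed to sit in the LSB

def check_parity(word16):
    """Return True when the 16b word still has an even number of ones."""
    return bin(word16 & 0xFFFF).count("1") % 2 == 0

def strip_parity(word16):
    """Recover the signed 15b sample from a 16b memory word."""
    data = (word16 >> 1) & DATA_MASK
    if data & (1 << (DATA_BITS - 1)):   # sign bit set
        data -= 1 << DATA_BITS
    return data

stored = add_parity(-9600)
corrupted = stored ^ 0x0100   # single-bit memory error
```

A single flipped bit makes the number of ones odd and is therefore detected, whereas an even number of flipped bits would go unnoticed, as with any single-bit parity scheme.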

The blanking control signal, which is also received from the digital control unit, is used to remove any residual artifacts in the output after template subtraction. Specifically, this control signal activates a multiplexer that replaces the output data with "0" for the time period in which blanking is applied, which is normally at the rising and falling edges of the artifact where it rapidly changes with time. The user can independently set the blanking duration around the rising and falling edges from 0 (i.e., no blanking) to 2,047 data points.
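The output-blanking multiplexer can be sketched in Python as follows. The function name and arguments are hypothetical, and the sketch assumes that each blanking window starts at the corresponding edge and extends forward in time; the exact placement of the window around each edge is not specified here.

```python
MAX_BLANKING_POINTS = 2047

def apply_blanking(output, rising_idx, falling_idx, n_rise, n_fall):
    """Replace output samples with 0 for n_rise / n_fall points after each edge."""
    for count in (n_rise, n_fall):
        if not 0 <= count <= MAX_BLANKING_POINTS:
            raise ValueError("blanking duration must be between 0 and 2,047 points")
    blanked = list(output)
    for start, count in ((rising_idx, n_rise), (falling_idx, n_fall)):
        for i in range(start, min(start + count, len(blanked))):
            blanked[i] = 0   # multiplexer selects "0" instead of the data
    return blanked

blanked = apply_blanking([5] * 10, rising_idx=1, falling_idx=6, n_rise=2, n_fall=3)
unblanked = apply_blanking([5] * 10, rising_idx=1, falling_idx=6, n_rise=0, n_fall=0)
```

Setting both durations to 0 leaves the output untouched, which corresponds to the no-blanking setting.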

The three registers in Fig. 6 are used for pipelining in order to overlap the processing in each stage and prevent harmful race conditions with proper timing control. Further, since the SAR algorithm circuitry operates synchronously with a system clock, all circuit blocks (except the parameter register) share the same system clock signal globally and use a local *Enable* signal for synchronization [32].

### V. FPGA MEASUREMENT RESULTS

The DSP unit as depicted in Fig. 6 has been synthesized and mapped to the Cyclone II FPGA, EP2C35F672C6, using *Altera*'s Quartus II design software. The mapped circuitry consumed 2% (656) of the total available logic elements (LEs) and 14% (65,536) of the total available memory bits. The DE2 board was programmed and connected to a digital data acquisition (DAQ) card, NI 6541, which provided the input signal to the FPGA and recorded the output waveforms. The system clock was applied to the FPGA using the onboard external clock port, and a supply of 9 V was used to power up the board with its input-output (I/O) ports at 3.3 V. For all FPGA measurements described below, factors  $K_1$  and  $K_2$  (see Fig. 6) were both set to 1/16.

Two sets of prerecorded neural data from two different laboratories were used to experimentally verify the operation of the IIR SAR algorithm and its FPGA implementation. Specifically, a 294-s window of prerecorded neural data from a rat was used as the first dataset. The rat data were sampled at  $\sim$ 24.4 kHz and obtained during 4-Hz cortical stimulation. A gain of 520 ( $\sim$ 54.3 dB) was applied to the neural data before feeding it to the FPGA. The SAR algorithm was set to operate for 5 ms upon receiving an indication of stimulation by the stimulus timing signal, and no output blanking was applied.

A 125-s window of prerecorded data from an *Aplysia californica* (a marine sea slug) was used as the second neural dataset.



Fig. 7. FPGA measurement results using prerecorded neural data from a laboratory rat. (a) Top plot shows a 294-s window of the input data to the FPGA. Middle plot depicts the generated stimulus artifact template signal, whereas the bottom plot shows the IIR system output from the FPGA. Two 5-ms snapshots of the waveforms are shown at (b) t = 208 s and (c) t = 256 s.

The *Aplysia* data were sampled at 2 kHz and obtained during 0.5-Hz stimulation. A gain of 1,000 (60 dB) was applied to the neural data before feeding it to the FPGA. Upon receiving an indication of stimulation by the stimulus timing signal, the SAR algorithm was set to operate for 96 ms (the duration of stimulus artifact in the *Aplysia* dataset was much longer than that in the rat dataset), and output blanking was set to occur for 4 ms synchronized with the rising and falling edges of the stimulus timing signal. The applied gain values represented those previously obtained with our neural-recording front-end operating from 1.5 V [3]. The gain values were high enough to achieve sufficient resolution at the DSP unit input, while keeping the amplitude of the amplified neural data below 1.5 V<sub>pp</sub>.
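The settings for the two datasets translate into memory lengths and gains that can be checked with a few lines of Python. The helper names are hypothetical, and the rat sampling rate is taken as exactly 24,400 Hz although the text only gives it as approximately 24.4 kHz.

```python
import math

MEMORY_ROWS = 4096   # 4K rows of 16b samples

def memory_length(fs_hz, window_ms):
    """Number of samples stored for one artifact window (integer arithmetic)."""
    return (fs_hz * window_ms) // 1000

def gain_db(gain):
    """Voltage gain expressed in decibels."""
    return 20 * math.log10(gain)

rat_length = memory_length(24_400, 5)        # 5-ms window at ~24.4 kHz
aplysia_length = memory_length(2_000, 96)    # 96-ms window at 2 kHz
aplysia_blanking = memory_length(2_000, 4)   # 4-ms blanking at 2 kHz

rat_gain_db = gain_db(520)
aplysia_gain_db = gain_db(1000)
```

Under these assumptions the rat and *Aplysia* windows occupy 122 and 192 memory rows, respectively, both far below the 4K rows available, and the 4-ms blanking interval corresponds to 8 data points.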

Fig. 7 shows the FPGA measurement results using the rat neural dataset. The top plot in (a) depicts the input neural data to the FPGA, consisting of neural spikes buried in large stimulus artifacts. The middle plot shows the generated artifact template signal after memory initialization as previously described. Note the fast response time of the IIR SAR algorithm in quickly generating the template signal even for the initial stimulus artifacts, as well as how fast the generated template signal tracks the variation in stimulus artifact amplitude in the first 100 seconds. The bottom plot depicts the IIR system output from the FPGA in which the large stimulus artifacts are rejected and the neural data recovered in real time.

Fig. 7(b) and (c) depict 5-ms snapshots of the waveforms at  $t = \sim 208$  s and  $\sim 256$  s, respectively, demonstrating that the system is fully capable of recovering neural action potentials that occur on the tail end of the artifact [see Fig. 7(c)] or appear as close as within 0.5 ms after the artifact spike [see Fig. 7(b)].

The slight discrepancy between the amplitude of the input artifact and that of the template signal is because the template signal actually represents the highpass filtered artifact.

Fig. 8 shows a 5-s snapshot of the waveforms in Fig. 7(a) around the onset of stimulation and their corresponding spectrograms obtained using 1,024-sample windows with 1,000-sample overlap. As can be seen in the top and middle spectrograms, the artifacts in the rat neural dataset have strong frequency components below 5 kHz that are significantly reduced in the output (see the bottom spectrogram), allowing the weaker neural activity to emerge from the large artifacts. For the very first stimulus artifact, which occurs just prior to t = 2.5 s and is the one loaded into the memory for its initialization, the corresponding template signal would be 1/16 of the artifact according to (2), and therefore 15/16 of the artifact appears in the output data after subtraction. The IIR SAR algorithm then removes all the subsequent stimulus artifacts starting with the second one. If present, artifact residuals as seen in Figs. 7(b) and (c) in the time domain and Fig. 8 in the frequency domain (bottom spectrogram) are now insignificant as compared to the neural action potentials.



Fig. 8. A 5-s snapshot of the FPGA measurement results using the prerecorded rat neural dataset and their corresponding spectrograms. The 5-s snapshot is taken around the stimulus onset.



Fig. 9. FPGA measurement results using the prerecorded *Aplysia* neural dataset and their corresponding spectrograms.

Fig. 9 shows the FPGA measurement results using the *Aplysia* neural dataset and their corresponding spectrograms. The top plot depicts the input neural data to the FPGA, containing many large stimulus artifacts that occur at 0.5 Hz and bursts of extracellular neural activity that occur in between and occasionally on the tail end of the artifacts. The middle plot shows the generated artifact template signal and its spectrogram, indicating that the artifacts have their frequency components spread throughout a 1-kHz bandwidth with strong frequency components contained below 200 Hz. The bottom plot shows the FPGA output data after blanking in which all stimulus artifacts (minus the first one as explained previously) are successfully removed from the recorded data in real time to recover the neural activity.

Fig. 10. Top plot shows a 96-ms portion of the *Aplysia* neural dataset, showing a total of 61 unfiltered stimulus artifacts superimposed on each other with some action potentials riding on the tail end of the artifacts. Middle plot depicts the 61 stimulus artifact templates superimposed on each other, which actually represent the highpass filtered artifacts (not shown). Bottom plot shows the artifact-free FPGA output in which the neural spikes are recovered after template subtraction. Residual artifacts are also simultaneously removed after 4-ms blanking (arrows). Note the smaller dynamic range of the Y-axis in the bottom plot after artifact removal and residual blanking.

Fig. 10 shows a close-up view of the waveforms during the 96-ms period of operation for the SAR algorithm. The top plot depicts 61 unfiltered stimulus artifacts superimposed on each other (i.e., all the artifacts present in the 125-s window of *Aplysia* neural dataset minus the very first one), with some action potentials also occurring on the tail end of the artifacts. The middle plot shows the corresponding artifact templates superimposed on each other, whereas the bottom plot depicts the artifact-free IIR system output from the FPGA after template subtraction and 4-ms blanking (arrows) for simultaneous removal of the artifacts and artifact residuals, respectively, demonstrating successful operation of the algorithm and its hardware implementation.

In order to assess the performance of the IIR SAR algorithm and its hardware implementation in a quantitative manner, a total of 908 stimulus artifacts (54 of 62 and 854 of 1,000 artifacts in the *Aplysia* and rat neural datasets, respectively) were analyzed. Specifically, the mean and standard deviation of the root-mean-square (rms) values of the artifacts were computed pre- and post-processing by the FPGA.

The analysis excluded the very first artifact in each neural dataset and those artifacts that had action potentials present anywhere in their duration over which the algorithm was operating (96 ms and 5 ms for the *Aplysia* and rat artifacts, respectively). This ensured that the occasional presence of action potentials did not confound the analysis. The same statistics were also obtained from segments of the FPGA output that represented pure noise (i.e., absence of both action potentials and artifact residuals). Table I tabulates the results of this analysis. In the case of the *Aplysia* neural dataset that contains relatively stationary stimulus artifacts (see the top plot in Fig. 9 and note the small standard deviation value in Table I), the rms value of the artifact on average is reduced by a factor of 17, resulting in post-processed rms values that are at the level of that for the output noise. In the case of the rat neural dataset that contains both stationary and non-stationary artifacts (see the top plot in Fig. 7(a) and note the larger standard deviation value in Table I), the reduction in the rms value on average is more modest (a factor of 5.3). A closer look at the rms values of individual stimulus artifacts pre- and post-processing reveals that the degradation of performance is limited to when there is a sudden change in the artifacts (see Fig. 11 and compare its trend with how the artifacts are changing in the top plot of Fig. 7(a)), whereas the rms values of the post-processed artifacts indeed approach that of the output noise when the artifacts are relatively stationary.

TABLE I. STATISTICS OF PRE- AND POST-PROCESSED STIMULUS ARTIFACTS (SAs)

| Dataset                              | Condition       | Mean (μV<sub>rms</sub>) | SD (μV<sub>rms</sub>) |
|--------------------------------------|-----------------|-------------------------|-----------------------|
| *Aplysia californica* (54 of 62 SAs) | Pre-Processing  | 68.33                   | 1.21                  |
|                                      | Post-Processing | 4.01                    | 0.68                  |
|                                      | Output Noise    | 3.83                    | 0.16                  |
| Rat (854 of 1,000 SAs)               | Pre-Processing  | 115.74                  | 21.72                 |
|                                      | Post-Processing | 21.65                   | 16.70                 |
|                                      | Output Noise    | 5.03                    | 0.46                  |

Fig. 11. Root-mean-square (rms) value of the stimulus artifacts (854 of 1,000) in the rat neural dataset pre- and post-processing by the FPGA. The dashed line represents an average rms value of 5.03 $\mu$V for the output noise obtained from 10 different 5-ms segments that did not contain any action potentials or artifact residuals.
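The reduction factors quoted above follow directly from the mean values in Table I, as the short Python sketch below shows. The helper names and the dictionary layout are illustrative assumptions; the decibel figures are derived here and are not reported in the table.

```python
import math

def rms(samples):
    """Root-mean-square value of one artifact window."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def reduction(pre_rms, post_rms):
    """Artifact reduction as a linear factor and in decibels."""
    factor = pre_rms / post_rms
    return factor, 20 * math.log10(factor)

# Mean rms values (in microvolts) taken from Table I.
TABLE_I_MEANS = {
    "aplysia": {"pre": 68.33, "post": 4.01, "noise": 3.83},
    "rat": {"pre": 115.74, "post": 21.65, "noise": 5.03},
}

aplysia_factor, aplysia_db = reduction(TABLE_I_MEANS["aplysia"]["pre"],
                                       TABLE_I_MEANS["aplysia"]["post"])
rat_factor, rat_db = reduction(TABLE_I_MEANS["rat"]["pre"],
                               TABLE_I_MEANS["rat"]["post"])
```

The ratios come out at about 17.0 for the *Aplysia* dataset and about 5.3 for the rat dataset, consistent with the factors stated in the text.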

### VI. CONCLUSION

This paper reported on a neural signal-processing algorithm for real-time stimulus artifact rejection (SAR) in which a high-fidelity template signal representative of the stimulus artifacts was first generated via temporal filtering and subsequently subtracted from the contaminated neural data to remove the artifacts. A system architecture for the IIR implementation of the algorithm was realized in hardware on an FPGA platform, featuring memory initialization as a simple method to significantly decrease the IIR system response time for accurate template generation. The measured FPGA results using two sets of prerecorded neural data from a rat and an *Aplysia californica* verified the functionality of the algorithm and its hardware implementation by removing the stimulus artifacts in real time from the contaminated recorded data and recovering the extracellular neural activity.

The major advantage of this approach as compared to the blanking techniques (i.e., disconnecting the recording amplifier input during stimulation) is that it has the potential to retain signal information during stimulation while fully eliminating the artifacts from the contaminated data record in real time. On the other hand, one limitation of this approach is that it does not directly address the problem of amplifier saturation and hence becomes less effective with prolonged amplifier saturation, unless care is taken in the design of the recording and stimulating circuitry to prevent (or at least minimize) amplifier saturation by decreasing the duration and amplitude of the artifacts [18]–[20]. Another limitation of this approach is that if neural activity occurs on the tail end of the artifact and is time-locked to stimulation, it will be removed by the system along with the artifacts. Similarly, if neural activity occurs during the rising/falling edges of the artifact spike, it will be lost, because it will be either blanked out by the system or heavily distorted by the residuals with no blanking.

This technique can potentially handle other stimulation scenarios as well, given that it only needs the stimulus timing signal information for correct operation. For example, if stimulation occurs *simultaneously* on two electrodes, a *combined* stimulus artifact might appear on the recording electrode that can be removed even by the current system. If stimulation occurs *alternately* on two electrodes, two different stimulus artifact types might appear *alternately* as well on the recording electrode and can be removed by modifying the timing operation of the system to handle each artifact type independently, if there is no temporal overlap between the artifacts. Ultimately, a tradeoff exists between functional versatility and system operation complexity.

Finally, given the relatively low system clock frequency of  $\leq 1$  MHz in this work and that the synthesized algorithm utilized a very small percentage of the available FPGA resources, it was not readily feasible to accurately determine the power consumption in hardware implementation. Efforts are currently under way for custom implementation of the DSP unit in Fig. 1 on an IC that would also incorporate recording front-end and stimulating back-end circuitry adapted from [3] to form a complete system. To that end, our preliminary work shows that the DSP unit can be implemented with a total area of 3.64 mm² (89% occupied by the 16b, 4K SRAM) in 0.35- $\mu$ m CMOS technology with power consumption on the order of low-tens of microwatts from 1.5 V (1-MHz system clock), indicating the feasibility of running the algorithm on a miniaturized, integrated device in the near future.

### APPENDIX

In this Appendix, we show the derivation of (3) in Section II: SAR Algorithm. As previously stated, based on Fig. 1

$$y_n = (1 - K) \cdot y_{n-1} + K \cdot x_n \tag{A1}$$

where  $n = 1, 2, 3, \dots$  Hence, it is simple to see that

$$\begin{aligned} y_1 &= (1 - K) \cdot y_0 + K \cdot x_1 \\ y_2 &= (1 - K) \cdot y_1 + K \cdot x_2 \\ &= (1 - K)^2 \cdot y_0 + K \cdot (1 - K) \cdot x_1 + K \cdot x_2 \end{aligned} \tag{A2}$$

which means that the template signal for the mth artifact can be written as

$$y_m = (1 - K)^m \cdot y_0 + K \cdot [x_m + (1 - K) \cdot x_{m-1} + \dots + (1 - K)^{m-1} \cdot x_1] \tag{A3}$$

where  $y_0$  is the initial condition of the memory. Assume that  $x_1, x_2, \dots x_m$  are all equal to the steady-state artifact template signal,  $y_{ss}$ . Therefore

$$\frac{y_m}{y_{ss}} = (1 - K)^m \cdot Y_0 + K \cdot [1 + (1 - K) + \dots + (1 - K)^{m-1}] \tag{A4}$$

where  $Y_0 = (y_0)/(y_{ss})$  is the initial condition of the memory normalized to the steady-state artifact template signal. Given the sum of geometric series, it can be shown that

$$\begin{aligned} 1 + (1 - K) + \dots + (1 - K)^{m-1} &= \frac{1 - (1 - K)^m}{1 - (1 - K)} \\ &= \frac{1 - (1 - K)^m}{K} \end{aligned} \tag{A5}$$

which means that (A4) can be simplified to

$$\frac{y_m}{y_{ss}} = (1 - K)^m \cdot Y_0 + 1 - (1 - K)^m. \tag{A6}$$
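As a quick numerical sanity check (not part of the original derivation), the following Python sketch iterates the recursion in (A1) with every input equal to $y_{ss}$ and compares the result with the closed form in (A6). Both quantities are normalized to $y_{ss}$; the function names and the sample values of $K$, $Y_0$, and $m$ are illustrative assumptions.

```python
def template_ratio_recursive(K, Y0, m):
    """Iterate (A1) m times with x_n = y_ss; returns y_m / y_ss."""
    ratio = Y0                      # normalized initial memory content
    for _ in range(m):
        ratio = (1 - K) * ratio + K * 1.0
    return ratio


def template_ratio_closed_form(K, Y0, m):
    """Closed form from (A6): (1-K)^m * Y0 + 1 - (1-K)^m."""
    decay = (1 - K) ** m
    return decay * Y0 + 1 - decay


if __name__ == "__main__":
    # Sample values chosen only for illustration.
    for K in (0.5, 0.125, 0.01):
        for Y0 in (0.0, 0.5, 1.5):
            for m in (1, 5, 50):
                rec = template_ratio_recursive(K, Y0, m)
                closed = template_ratio_closed_form(K, Y0, m)
                assert abs(rec - closed) < 1e-12
    print("recursion and closed form agree")
```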

If $Y_0 < 1$, for generating an accurate template signal with error less than, e.g., 0.1%, one needs to have $y_m/y_{ss} > 0.999$, which means $(1-K)^m \cdot (1-Y_0) < 0.001$ from (A6). Taking the base-10 logarithm of both sides and noting that $\log_{10}(1-K) < 0$ (so that dividing by it reverses the inequality), one can obtain

$$m > \frac{-3 - \log_{10}(1 - Y_0)}{\log_{10}(1 - K)}. \tag{A7}$$

If  $Y_0 > 1$ , for generating an accurate template signal with error less than 0.1%, one needs to have  $(y_m)/(y_{ss}) < 1.001$ , which ultimately leads to

$$m > \frac{-3 - \log_{10}(Y_0 - 1)}{\log_{10}(1 - K)}. \tag{A8}$$
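The bounds in (A7) and (A8) differ only in the sign of $1 - Y_0$, so they can be evaluated together using $|1 - Y_0|$. The Python sketch below does this to estimate the smallest number of artifacts needed for a given $K$ and $Y_0$; the function name `min_artifacts` and the generalization of the 0.1% tolerance to an `error` parameter are assumptions made for illustration.

```python
import math


def min_artifacts(K, Y0, error=1e-3):
    """Smallest integer m satisfying (A7)/(A8) for a relative error bound.

    K:     filter weight, 0 < K < 1.
    Y0:    initial memory content normalized to y_ss.
    error: allowed relative template error (0.001 corresponds to 0.1%).
    """
    if not 0 < K < 1:
        raise ValueError("K must lie strictly between 0 and 1")
    gap = abs(1 - Y0)               # covers both Y0 < 1 (A7) and Y0 > 1 (A8)
    if gap < error:
        return 0                    # initial memory is already within tolerance
    bound = (math.log10(error) - math.log10(gap)) / math.log10(1 - K)
    return math.floor(bound) + 1    # smallest integer strictly above the bound


if __name__ == "__main__":
    # Illustrative values only: K = 0.5 with empty vs. nearly correct memory.
    print(min_artifacts(0.5, 0.0))  # 10
    print(min_artifacts(0.5, 0.9))  # 7
```

In this example, starting the memory at 90% of the steady-state template instead of zero reduces the required number of artifacts from 10 to 7, which illustrates why memory initialization shortens the IIR response time.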

### ACKNOWLEDGMENT

The authors would like to thank Dr. D. Guggenmos and Prof. R. Nudo, Kansas University Medical Center, Kansas City, KS, USA, for providing the prerecorded rat neural dataset. The authors would also like to thank Dr. M. Azin, Qualcomm, San Diego, CA, USA, and Prof. M. Buchner, Case Western Reserve University, Cleveland, OH, USA, for helpful discussions that made this work possible.

#### REFERENCES

- [1] K. C. McGill *et al.*, "On the nature and elimination of stimulus artifact in nerve signals evoked and recorded using surface electrodes," *IEEE Trans. Biomed. Eng.*, vol. BME-29, no. 2, pp. 129–137, Feb. 1982.
- [2] M. Azin, D. J. Guggenmos, S. Barbay, R. J. Nudo, and P. Mohseni, "A miniaturized system for spike-triggered intracortical microstimulation in an ambulatory rat," *IEEE Trans. Biomed. Eng.*, vol. 58, no. 9, pp. 2589–2597, Sep. 2011.
- [3] M. Azin, D. J. Guggenmos, S. Barbay, R. J. Nudo, and P. Mohseni, "A battery-powered activity-dependent intracortical microstimulation IC for brain-machine-brain interface," *IEEE J. Solid-State Circuits*, vol. 46, no. 4, pp. 731–745, Apr. 2011.
- [4] B. H. Boudreau, K. B. Englehart, A. D. C. Chan, and P. A. Parker, "Reduction of stimulus artifact in somatosensory evoked potentials: Segmented versus subthreshold training," *IEEE Trans. Biomed. Eng.*, vol. 51, no. 7, pp. 1187–1195, Jul. 2004.
- [5] D. A. Wagenaar and S. M. Potter, "Real-time multichannel stimulus artifact suppression by local curve fitting," *J. Neurosci. Methods*, vol. 120, no. 2, pp. 113–120, Oct. 2002.
- [6] H. Liang and Z. Lin, "Stimulus artifact cancellation in the serosal recordings of gastric myoelectric activity using wavelet transform," *IEEE Trans. Biomed. Eng.*, vol. 49, no. 7, pp. 681–688, Jul. 2002.
- [7] V. Parsa, P. A. Parker, and R. N. Scott, "Adaptive stimulus artifact reduction in noncortical somatosensory evoked potential studies," *IEEE Trans. Biomed. Eng.*, vol. 45, no. 2, pp. 165–179, Feb. 1998.
- [8] A. Demosthenous, J. Taylor, I. F. Triantis, R. Rieger, and N. Donaldson, "Design of an adaptive interference reduction system for nerve-cuff electrode recording," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 51, no. 4, pp. 629–639, Apr. 2004.
- [9] R. Vigario, J. Sarela, V. Jousmaki, M. Hamalainen, and E. Oja, "Independent component approach to the analysis of EEG and MEG recordings," *IEEE Trans. Biomed. Eng.*, vol. 47, no. 5, pp. 589–593, May 2000.
- [10] V. Sharma, D. B. McCreery, M. Han, and V. Pikov, "Bidirectional telemetry controller for neuroprosthetic devices," *IEEE Trans. Neural Syst. Rehabil. Eng.*, vol. 18, no. 1, pp. 67–74, Feb. 2010.
- [11] Z. M. Nikolic, D. B. Popovic, R. B. Stein, and Z. Kenwell, "Instrumentation for ENG and EMG recordings in FES systems," *IEEE Trans. Biomed. Eng.*, vol. 41, no. 7, pp. 703–706, Jul. 1994.
- [12] F. Shahrokhi, K. Abdelhalim, D. Serletis, P. L. Carlen, and R. Genov, "The 128-channel fully differential digital integrated neural recording and stimulation interface," *IEEE Trans. Biomed. Circuits Syst.*, vol. 4, no. 3, pp. 149–161, Jun. 2010.
- [13] H. Jadvar and D. W. Benson, Jr., "A stimulus artifact suppressor for esophageal pacing studies: Design and clinical testing," in *Proc. 11th Annu. Int. IEEE Eng. Med. Biol. Conf.*, 1989, pp. 1401–1402.
- [14] A. E. Hines, P. E. Crago, G. J. Chapman, and C. Billian, "Stimulus artifact removal in EMG from muscles adjacent to stimulated muscles," *J. Neurosci. Methods*, vol. 64, no. 1, pp. 55–62, Jan. 1996.
- [15] G. DeMichele and P. R. Troyk, "Stimulus-resistant neural recording amplifier," in *Proc. 25th Annu. Int. IEEE Eng. Med. Biol. Conf.*, 2003, pp. 3329–3332.
- [16] T. Hashimoto, C. M. Elder, and J. L. Vitek, "A template subtraction method for stimulus artifact removal in high-frequency deep brain stimulation," *J. Neurosci. Methods*, vol. 113, pp. 181–186, 2002.
- [17] T. Wichmann, "A digital averaging method for removal of stimulus artifacts in neurophysiologic experiments," *J. Neurosci. Methods*, vol. 98, pp. 57–62, 2000.
- [18] E. A. Brown, J. D. Ross, R. A. Blum, Y. Nam, B. C. Wheeler, and S. P. DeWeerth, "Stimulus-artifact elimination in a multielectrode system," *IEEE Trans. Biomed. Circuits Syst.*, vol. 2, no. 1, pp. 10–21, Mar. 2008.
- [19] R. A. Blum, J. D. Ross, E. A. Brown, and S. P. DeWeerth, "An integrated system for simultaneous multichannel neuronal stimulation and recording," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 12, pp. 2608–2618, Dec. 2007.
- [20] T. L. Hanson *et al.*, "High-side digitally current-controlled biphasic bipolar microstimulator," *IEEE Trans. Neural Syst. Rehabil. Eng.*, vol. 20, no. 3, pp. 331–340, May 2012.

- [21] M. Azin, H. J. Chiel, and P. Mohseni, "Comparisons of FIR and IIR implementations of a subtraction-based stimulus artifact rejection algorithm," in *Proc. 29th Annu. Int. IEEE Eng. Med. Biol. Conf.*, 2007, pp. 1437–1440.
- [22] K. Limnuson, H. Lu, H. J. Chiel, and P. Mohseni, "FPGA implementation of an IIR temporal filtering technique for real-time stimulus artifact rejection," in *Proc. IEEE Biomed. Circuits and Systems Conf.*, 2011, pp. 49–52.
- [23] J. S. Hunter, "The exponentially weighted moving average," *J. Qual. Technol.*, vol. 18, no. 4, pp. 203–210, Oct. 1986.
- [24] "Monitoring process variability using EWMA," in *Springer Handbook of Engineering Statistics*, H. Pham, Ed. New York, NY, USA: Springer, 2006, pp. 291–325.
- [25] S. W. Roberts, "Control chart tests based on geometric moving averages," *Technometrics*, vol. 1, no. 3, pp. 239–250, Aug. 1959.
- [26] J. F. Muth, "Optimal properties of exponentially weighted forecasts," *J. Amer. Stat. Assoc.*, vol. 55, pp. 299–306, 1960.
- [27] R. A. Freund, "Graphical process control," *Indus. Qual. Contr.*, vol. 18, pp. 15–22, 1962.
- [28] A. V. Oppenheim and R. W. Schafer, *Discrete-Time Signal Processing*. Upper Saddle River, NJ, USA: Prentice-Hall, 2010.
- [29] S. K. Mitra, *Digital Signal Processing: A Computer-Based Approach*. New York, NY, USA: McGraw-Hill, 2006.
- [30] M. Christensen and F. J. Taylor, "Fixed-point-IIR-filter challenges," *EDN Netw.*, vol. 51, no. 23, pp. 111–122, Nov. 2006.
- [31] B. Widrow and I. Kollar, *Quantization Noise: Round-Off Error in Digital Computation, Signal Processing, Control, and Communications*. New York, NY, USA: Cambridge Univ. Press, 2008.
- [32] P. P. Chu, *RTL Hardware Design Using VHDL: Coding for Efficiency, Portability, and Scalability*. Hoboken, NJ, USA: Wiley, 2006.



**Kanokwan Limnuson** (S'09) received the B.Eng. degree from Chulalongkorn University, Bangkok, Thailand, and the M.S. degree from Case Western Reserve University (CWRU), Cleveland, OH, USA, both in electrical engineering, in 2004 and 2008, respectively.

Currently, she is working toward the Ph.D. degree in the BioMicroSystems Laboratory at CWRU. Her research interests are the development of neural signal-processing algorithms and their low-power implementation with analog and VLSI circuitry for bidirectional neural interfaces.



**Hui Lu** was born in 1981. She received the B.S. degree in biology from Nanjing University, Nanjing, China, in 2003.

Currently, she is working toward the Ph.D. degree in biology at Case Western Reserve University, Cleveland, OH, USA. Her research focuses on neural motor control and neuromodulation of the adaptive feeding behaviors in the marine mollusk *Aplysia californica*. She has authored seven peer-reviewed journal papers.



**Hillel J. Chiel** received the B.A. degree in English from Yale University, New Haven, CT, USA, and the Ph.D. degree in neural and endocrine regulation from the Massachusetts Institute of Technology (MIT), Cambridge, MA, USA.

After postdoctoral work at the Center for Neurobiology and Behavior at the College of Physicians and Surgeons, Columbia University, New York, NY, USA, and in the Department of Molecular Biophysics at AT&T Bell Laboratories, he joined the faculty of Case Western Reserve University, Cleveland, OH, USA. Currently, he is a Professor of Biology, with secondary appointments in the Departments of Neurosciences and Biomedical Engineering. His research focuses on the biomechanical and neural mechanisms of adaptive behavior, using the marine mollusk *Aplysia californica* as a model system. His research has led to the development of novel technology for imaging muscle movements and neural activity in intact animals, and biologically-inspired soft robots. He is a holder of four patents and has authored more than 115 peer-reviewed publications.

Dr. Chiel has been a Fellow of the Institute of Physics since 2004. He served as Guest Coeditor of a special issue on applied neurodynamics for the *Journal of Neural Engineering* with Dr. Peter Thomas in December 2011. He also serves on the editorial boards of the *Journal of Neural Engineering*, *Soft Robotics*, and the *Journal of Visualized Experiments*. In 2012, he won a prize from *Science*, published by the American Association for the Advancement of Science, for inquiry-based education.



**Pedram Mohseni** (S'94–M'05–SM'11) was born in 1974. He received the B.S. degree from the Sharif University of Technology, Tehran, Iran, in 1996, and the M.S. and Ph.D. degrees from the University of Michigan, Ann Arbor, MI, USA, all in electrical engineering, in 1999 and 2005, respectively.

Currently, he is a tenured Associate Professor in the Electrical Engineering and Computer Science Department at Case Western Reserve University, Cleveland, OH, USA, with a secondary appointment in the Biomedical Engineering Department. His research interests include analog/mixed-signal/RF integrated circuits and microsystems for neural engineering, wireless sensing/actuating systems for brain-machine interfaces, biomedical microtelemetry, and assembly/packaging of biomicrosystems.

Dr. Mohseni has been an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-PART II (2010–2012), IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS (2008–present), and IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING (2012–present). In addition, he was a Guest Editor for the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS in 2011. He also serves on the Technical Program Committee of the IEEE CICC and RFIC Symposium. He was the recipient of the EECS Faculty Research Award for Exceptional Achievement in 2008, the National Science Foundation Career Award in 2009, and the Case School of Engineering Research Award in 2011, and he earned the first-place prize of the Medical Device Entrepreneur's Forum at the 58th annual conference of the ASAIO in 2012. He is a member of the IEEE Solid-State Circuits, Circuits and Systems, and Engineering in Medicine and Biology Societies.